Finding Unique Filter Sets in PLATO: A Precursor to Efficient Interaction Analysis in GWAS Data

Authors

  • Benjamin J. Grady
  • Eric Torstenson
  • Scott M. Dudek
  • Justin Giles
  • David Sexton
  • Marylyn D. Ritchie
Abstract

Methods to detect gene-gene interactions between variants in genome-wide association study (GWAS) datasets have not been well developed thus far. PLATO, the Platform for the Analysis, Translation and Organization of large-scale data, is a filter-based method that brings together many analytical methods simultaneously in an effort to solve this problem. PLATO filters a large genomic dataset down to a subset of genetic variants that may be useful for interaction analysis. As a precursor to the use of PLATO for the detection of gene-gene interactions, the implementation of a variety of single-locus filters was completed and evaluated as a proof of concept. To streamline PLATO for efficient epistasis analysis, we determined which of 24 analytical filters produced redundant results. Using a kappa score to identify agreement between filters, we grouped the analytical filters into four filter classes; thus all further analyses employed four filters. We then tested the MAX statistic put forth by Sladek et al. (1) in simulated data exploring a number of genetic models of modest effect size. To find the MAX statistic, the four filters were run on each SNP in each dataset and the smallest p-value among the four results was taken as the final result. Permutation testing was performed to empirically determine the p-value. The power of the MAX statistic to detect each of the simulated effects was determined, in addition to the Type 1 error and false positive rates. The results of this simulation study demonstrate that PLATO using the four filters incorporating the MAX statistic has higher power on average to find multiple types of effects and a lower false positive rate than any of the individual filters alone. In the future we will extend PLATO with the MAX statistic to interaction analyses for large-scale genomic datasets.
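
The two computations the abstract describes (a kappa score for agreement between filters, and a MAX statistic taken as the smallest p-value across filters with a permutation-based empirical p-value) can be sketched roughly as follows. This is a minimal illustration, not PLATO's implementation: the two filters shown (`allelic_filter`, `dominant_filter`) are hypothetical stand-ins for the paper's four filter classes, genotypes are assumed to be coded 0/1/2 and phenotypes 0/1, and all function names and the permutation count are assumptions made for the example.

```python
import math
import random


def chi2_2x2_pvalue(a, b, c, d):
    """P-value of a 1-df chi-square test on the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # degenerate table: treat as no evidence of association
    stat = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))


def allelic_filter(genotypes, phenotypes):
    """Hypothetical filter: alternate-allele counts in cases vs controls."""
    n_case = sum(phenotypes)
    n_ctrl = len(phenotypes) - n_case
    case_alt = sum(g for g, y in zip(genotypes, phenotypes) if y == 1)
    ctrl_alt = sum(g for g, y in zip(genotypes, phenotypes) if y == 0)
    return chi2_2x2_pvalue(case_alt, 2 * n_case - case_alt,
                           ctrl_alt, 2 * n_ctrl - ctrl_alt)


def dominant_filter(genotypes, phenotypes):
    """Hypothetical filter: carriers (genotype >= 1) vs non-carriers."""
    n_case = sum(phenotypes)
    n_ctrl = len(phenotypes) - n_case
    case_car = sum(1 for g, y in zip(genotypes, phenotypes) if y == 1 and g >= 1)
    ctrl_car = sum(1 for g, y in zip(genotypes, phenotypes) if y == 0 and g >= 1)
    return chi2_2x2_pvalue(case_car, n_case - case_car,
                           ctrl_car, n_ctrl - ctrl_car)


def min_p(genotypes, phenotypes, filters):
    """Smallest p-value across all filters for one SNP."""
    return min(f(genotypes, phenotypes) for f in filters)


def max_statistic_pvalue(genotypes, phenotypes, filters, n_perm=1000, seed=0):
    """Return (observed smallest p-value, permutation-based empirical p-value)."""
    rng = random.Random(seed)
    observed = min_p(genotypes, phenotypes, filters)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype association
        if min_p(genotypes, shuffled, filters) <= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length lists of binary (0/1) filter calls."""
    n = len(calls_a)
    agree = sum(1 for a, b in zip(calls_a, calls_b) if a == b) / n
    pa = sum(calls_a) / n
    pb = sum(calls_b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1.0:
        return 1.0  # assumed convention: both filters make one identical constant call
    return (agree - expected) / (1 - expected)


# Toy data: 20 cases all homozygous alternate, 20 controls all homozygous reference.
genotypes = [2] * 20 + [0] * 20
phenotypes = [1] * 20 + [0] * 20
filters = [allelic_filter, dominant_filter]
observed, empirical = max_statistic_pvalue(genotypes, phenotypes, filters, n_perm=200)
```

Taking the minimum over several filters inflates the chance of a small p-value under the null, which is why the sketch compares the observed minimum against the minimum recomputed on each permuted dataset rather than against a nominal threshold. For grouping filters, `cohen_kappa` would be applied to the per-SNP significant/not-significant calls of each pair of filters; how the paper turned kappa values into four classes is not specified in the abstract and is not modeled here.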

Similar resources

Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection

Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...

ALTERNATIVE MIXED INTEGER PROGRAMMING FOR FINDING EFFICIENT BCC UNIT

Data Envelopment Analysis (DEA) cannot provide adequate discrimination among efficient decision making units (DMUs). Discriminating among these efficient DMUs is an interesting research subject. The purpose of this paper is to develop the mixed integer linear model which was proposed by Foroughi (Foroughi A.A. A new mixed integer linear model for selecting the best decision making units in data envelo...

Finding Common Weights in Two-Stage Network DEA

In data envelopment analysis (DEA), multiplier and envelopment CCR models evaluate the decision-making units (DMUs) under optimal conditions. Therefore, the best prices are allocated to the inputs and outputs. Thus, if a given DMU was not efficient under optimal conditions, it would not be considered efficient by any other models. In the current study, using common weights in DEA, a number of...

Some Conditions for Characterizing Minimum Face in Non-Radial DEA Models with Undesirable Outputs

The problem of utilizing undesirable (bad) outputs in DEA models often requires replacing the assumption of free disposability of outputs by weak disposability of outputs. The Kuosmanen technology is the only correct representation of the fully convex technology exhibiting weak disposability of bad and good outputs. Also, there are some specific features of non-radial data envelopment analysis (DEA...

Study of Clear Air Turbulence over Iranian Plato

This study was carried out using two sets of numerical weather forecast data and flight reports for Clear Air Turbulence (CAT) over Iranian Plato to find atmospheric flow patterns favorable to the formation of CAT. The numerical data include five months of AVN analysis with a horizontal resolution of 1 degree (about 100 km) and four months of forecast data from the MM5 model with a resolution of 50 km. Impor...

Journal title:
  • Pacific Symposium on Biocomputing

Publication date: 2010